1.0 INTRODUCTION


In many problems of quantum chemistry, nuclear and plasma physics, and 
economics, one may encounter a random process in which the variables are not 
independent and which can have a discrete, continuous, or mixed spectrum. In 
this paper, an algorithm is presented which allows a computer simulation of 
any such random process of real random variables if their joint
distribution is known. Examples are presented to illustrate the theory.

Mathematical tools used in developing the theory are based on Kolmogorov's 
fundamental paper on probability theory (ref. 9) and some results of Halmos 
(ref. 8), Neveu (ref. 10), and Bogdanowicz (refs. 2 through 6) on measure and 
integration theory. 

Readers who are interested in applications only should concentrate on 
sections 1 through 4 and 8 through 13 and read the remaining sections as 
needed to understand the principles. Knowledge of the Lebesgue integral with 
respect to an abstract measure is essential to understanding the proofs. The 
use of Dirac's delta function is helpful in applications. 

The theoretical results are formulated in terms of Borel functions; that
is, functions measurable with respect to the smallest sigma ring containing
all cubes. As established by Halmos (ref. 8), this class of functions
coincides for $R^n$ spaces with the class of Baire functions; that is, the
smallest class which is closed under the sequential limit and contains all
continuous functions. The importance of Baire functions in the general
theory of random processes is presented in reference 1.


2.0 COMPUTERIZATION OPERATOR


Let F be the probability distribution of a real random variable x;
i.e., $F(a) = P\{x < a\}$ for all $a \in R$. Such a function has the
following properties:

a. F is nondecreasing on R.

b. F is left side continuous on R.

c. $F(-\infty) = 0$ and $F(\infty) = 1$.

These properties characterize distributions of real random variables 
according to Kolmogorov's theorem; i.e., if F has properties a, b, and c, 
then there exists a probability space and a random variable x over it such 
that F is its probability distribution. 
If a distribution F is absolutely continuous in the Lebesgue sense,
then there exists a Lebesgue summable function f on R such that

$$F(a) = \int_{-\infty}^{a} f(t)\,dt \quad \text{for all } a \in R.$$

Such a function f is called the density of the distribution F.

In many applications, one encounters distributions that do not have
Lebesgue summable densities. For example, for $x \in R$, let

$$f(x) = p_1\,\delta(x - x_1) + p_2\,\delta(x - x_2) + \dots + p_n\,\delta(x - x_n) \tag{2-1}$$

where $0 < p_i$ for $i = 1, \dots, n$, $p_1 + \dots + p_n = 1$, and
$\delta(x)$ denotes Dirac's delta function; i.e., the formal density of the
distribution h given by the formula

$$h(a) = 0 \ \text{if } a \le 0, \qquad h(a) = 1 \ \text{if } a > 0.$$

Then the distribution F corresponding to the function f is given by the
formula

$$F(a) = p_1 h(a - x_1) + p_2 h(a - x_2) + \dots + p_n h(a - x_n) \tag{2-2}$$

for all $a \in R$. Such a distribution has jumps at the points
$x_1, \dots, x_n$ and is constant between them. Hence, F is not an
absolutely continuous function. A random variable x corresponding to such a
distribution has only discrete states $x_1, \dots, x_n$.

However, in some applications, one encounters random variables with
discrete and continuous states. For example, consider a random variable
with a density

$$g(x) = p_1\,\delta(x - x_1) + p_2\,h(x)\,e^{-x} \tag{2-3}$$

where $0 < p_i$, $p_1 + p_2 = 1$, and h is the distribution of $\delta$.

Such a density could appear in a steady state process x representing the
energy level of a particle if the source emits particles having the specific
energy level $x = x_1$ with probability $p_1$ and having other energy levels
$x \ne x_1$ with joint probability $p_2$.

To simulate random variables with mixed states, it is convenient to
introduce the computerization operator c mapping a distribution F into a
function G over the open unit interval (0,1). This function G is defined
by the formula $G(u) = \inf\{x \in R: u < F(x)\}$ for all $u \in (0,1)$.
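The definition of G can be tried out numerically. The following is a minimal Python sketch, not part of the original report: the helper names `computerize` and `F_mixed`, the search bounds `lo` and `hi`, and the tolerance `tol` are all assumptions made for illustration. It approximates the infimum by bisection, which is valid only because F is nondecreasing, and it uses a mixed distribution of the form (2-3) with $p_1 = p_2 = 0.5$ and $x_1 = -1$.

```python
import math


def computerize(F, lo=-50.0, hi=50.0, tol=1e-9):
    """Return a function approximating G(u) = inf{x : u < F(x)}.

    Assumption (not from the source): for every u of interest,
    F(lo) <= u < F(hi), so the infimum lies in [lo, hi].
    """
    def G(u):
        a, b = lo, hi              # invariant: F(a) <= u < F(b)
        while b - a > tol:
            mid = 0.5 * (a + b)
            if u < F(mid):         # mid belongs to the set {x : u < F(x)}
                b = mid
            else:                  # mid is not in the set, so mid <= infimum
                a = mid
        return b                   # within tol of the infimum
    return G


def F_mixed(x):
    """Left-continuous distribution: jump of 0.5 at x = -1, then 0.5*(1 - e^{-x})."""
    jump = 0.5 if x > -1.0 else 0.0
    cont = 0.5 * (1.0 - math.exp(-x)) if x > 0.0 else 0.0
    return jump + cont


G = computerize(F_mixed)
print(G(0.25), G(0.75))  # close to -1 and to log 2, respectively
```

Bisection is only one possible way to evaluate the infimum; when a closed form for G is available, as in the examples of section 3, it is preferable.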


2.1 THEOREM 


The computerization operator c is well defined for every function F
that is a distribution; i.e., that has properties a, b, and c. If the
variable u has a uniform distribution on the open interval (0,1), then the
variable x = G(u) has a probability distribution equal to the function F.

Proof. Take any $u \in (0,1)$. It follows from property c that there are
two points $x_1$ and $x_2$ such that $F(x_1) < u < F(x_2)$. This implies
that the set $A(u) = \{x \in R: u < F(x)\}$ is nonempty. It follows from
property a that the number $x_1$ is a lower bound of the set A(u). From the
axiom of continuity, it follows that the function G(u) = inf A(u) is well
defined.

To compute the distribution of the variable x = G(u), consider the set
$H(a) = \{u \in (0,1): G(u) < a\}$ for a fixed $a \in R$.

It follows from the property of infimum that $u \in H(a)$ if and only if
there exists an $x \in R$ such that $x < a$ and $u < F(x)$. Thus,
introducing the set $\{u \in (0,1): u < F(x)\} = (0, F(x))$, we get
$H(a) = \bigcup_{x<a} (0, F(x))$.

Since the function F is continuous from the left, the union of the
intervals (0,F(x)) over all x < a is equal to the interval (0,F(a)). Thus,
H(a) = (0,F(a)) for all $a \in R$. Since the probability that a uniformly
distributed variable on the interval (0,1) falls into an interval I being
a subinterval of (0,1) is equal to the length of that interval, we get
$P\{x < a\} = P(H(a)) = P((0,F(a))) = F(a)$ for all $a \in R$.
2.2 REMARK


For every distribution F, the function G = c(F) has the following
properties:

1. G is nondecreasing on (0,1).

2. G is right side continuous on (0,1).

3. If $u \in (F(x), F(x+))$, when F has a jump at x, then G(u) = x.

4. If F is strictly increasing and continuous on a closed interval
<c,d>, then G(u) = x for $u \in \langle F(c), F(d)\rangle$ if and only if
u = F(x).

These properties of the function G allow one to find the graph of the
function G from the graph of the distribution function F by the following
steps:


1. Fill all vertical jumps in the graph of the function u = F(x) with
linear segments.

2. Treat the u-axis as the axis of the independent variable and the
x-axis as that of the dependent variable.

3. At points u where there are several values of x such that
u = F(x), define G so that G(u) = G(u+).
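For a purely discrete distribution of the form (2-2), the steps above produce a step function. The sketch below is an illustration only, not taken from the report: the function name `discrete_G` and the sample states and weights are hypothetical. The cumulative sums of the weights play the role of the values F(x+), and the right-continuity convention of step 3 corresponds to picking the first cumulative sum that is strictly greater than u.

```python
from bisect import bisect_right
from itertools import accumulate


def discrete_G(xs, ps):
    """Computerization of F(a) = sum_i p_i * h(a - x_i) for states xs, weights ps."""
    pairs = sorted(zip(xs, ps))                 # order the states increasingly
    points = [x for x, _ in pairs]
    cum = list(accumulate(p for _, p in pairs))  # cum[j] = F(points[j]+)

    def G(u):
        # index of the first cumulative sum strictly greater than u
        j = bisect_right(cum, u)
        # guard against cum[-1] falling slightly below 1 through rounding
        return points[min(j, len(points) - 1)]

    return G


# Hypothetical example: states -1, 2, 5 with weights 0.2, 0.3, 0.5.
G = discrete_G([2.0, -1.0, 5.0], [0.3, 0.2, 0.5])
print([G(u) for u in (0.1, 0.2, 0.49, 0.5, 0.99)])
```

At a value such as u = 0.2, where u = F(x) holds on a whole interval of x, the sketch returns the right-hand limit, which agrees with step 3.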

3.0 SIMULATION OF A SINGLE RANDOM VARIABLE 


Since random number generators available on computers simulate a random
variable u with uniform distribution on the interval (0,1), one may
simulate with good accuracy the distribution F of a random variable whose
computerization G is a piecewise continuous function. Many distributions
appearing in applied problems fall into this category.


3.1 EXAMPLE 


Assume that each call to the Fortran function RAN(0) gives a different
random number u in the interval (0,1). Write a segment of a Fortran
program to generate N = 31 random values of a variable with the density
$g(x) = 0.5\,\delta(x + 1) + 0.5\,h(x)\,e^{-x}$, $x \in R$, where $\delta$
and h are defined as before.
Solution. Computing the distribution function from the density g by
the formula

$$F(x) = \int_{-\infty}^{x} g(t)\,dt \quad \text{for all } x \in R,$$

we get $F(x) = 0.5\,h(x + 1) + 0.5\,h(x)(1 - e^{-x})$. The graph of the
function F is given in the following diagram.

[Diagram: graph of the distribution function u = F(x)]

From this diagram, we get the formula for the computerization G
of distribution F using remark 2.2. This yields

$$G(u) = -1 \ \text{if } 0 < u < 0.5, \qquad G(u) = -\log(2 - 2u) \ \text{if } 0.5 \le u < 1 \tag{3-1}$$


Thus, the segment of the program to simulate the distribution F may look
as follows:

C     MAIN SEGMENT: FILL X WITH N = 31 SIMULATED VALUES
      DIMENSION X(100)
      N = 31
      DO 5 I = 1, N
      X(I) = G(RAN(0))
    5 CONTINUE
      STOP
      END
C     COMPUTERIZATION G OF THE DISTRIBUTION F, FORMULA (3-1)
      FUNCTION G(U)
      IF (0.0 .LT. U .AND. U .LT. 0.5) G = -1.
      IF (0.5 .LE. U .AND. U .LT. 1.) G = -ALOG(2. - 2.*U)
      RETURN
      END

The array X(I) for I = 1 to 31 will contain a random sample of a variable
whose distribution is given by the function F.
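For readers without access to a Fortran system providing RAN, the same segment can be sketched in Python. This is an illustrative translation, not the report's program: the names `G` and `sample` are chosen here, and Python's `random.Random.random()` stands in for RAN(0). Because that generator returns values in the half-open interval [0,1), the value 0 is rejected so that u stays in the open interval (0,1).

```python
import math
import random


def G(u):
    """Computerization (3-1) of the distribution F of example 3.1."""
    if 0.0 < u < 0.5:
        return -1.0
    if 0.5 <= u < 1.0:
        return -math.log(2.0 - 2.0 * u)
    raise ValueError("u must lie in the open interval (0, 1)")


def sample(n, rng):
    """Draw n values x = G(u) with u uniform on (0, 1)."""
    out = []
    while len(out) < n:
        u = rng.random()          # uniform on [0, 1)
        if u > 0.0:               # discard the endpoint 0
            out.append(G(u))
    return out


rng = random.Random(0)            # fixed seed, for reproducibility only
x = sample(31, rng)
print(len(x), min(x))
```

About half of the values are expected to equal -1 (the discrete state) and the rest to be nonnegative (the continuous part).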


3.2 EXAMPLE 


Find the computerization G for a random variable whose density
function is

$$f(x) = \tfrac{1}{2}\cos x \ \text{if } -\pi/2 < x < \pi/2, \qquad f(x) = 0 \ \text{if } |x| \ge \pi/2 \tag{3-2}$$


Solution. The distribution F of the function f is given by the
formula

$$F(x) = \begin{cases} \tfrac{1}{2}(\sin x + 1) & \text{if } |x| \le \pi/2 \\ 0 & \text{if } x < -\pi/2 \\ 1 & \text{if } x > \pi/2 \end{cases} \tag{3-3}$$

Since the function F is continuous on the closed interval
$\langle -\pi/2, \pi/2\rangle$ and maps it onto the interval <0,1>, we can
use the equivalence of x = G(u) with u = F(x); i.e.,
$u = \tfrac{1}{2}(\sin x + 1)$. Solving this equation for x, we get
$x = \arcsin(2u - 1)$. Thus, the computerization G is given by
$G(u) = \arcsin(2u - 1)$ for $u \in (0,1)$.
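The result can be checked by composing F with G. The short Python sketch below (function names chosen here for illustration) evaluates F(G(u)) for a few values of u; since F is continuous and strictly increasing on the interval in question, the composition should return u up to rounding.

```python
import math


def F(x):
    """Distribution function (3-3)."""
    if x < -math.pi / 2:
        return 0.0
    if x > math.pi / 2:
        return 1.0
    return 0.5 * (math.sin(x) + 1.0)


def G(u):
    """Computerization of F for u in (0, 1)."""
    return math.asin(2.0 * u - 1.0)


for u in (0.01, 0.25, 0.5, 0.75, 0.99):
    print(u, G(u), F(G(u)))
```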


4.0 SIMULATION OF INDEPENDENT RANDOM VARIABLES 


In many applications, one has to investigate processes consisting of
several random variables $f_1, f_2, \dots, f_k$. Such variables are called
independent if their joint distribution function F defined by

$$F(a_1, a_2, \dots, a_k) = P\{f_1 < a_1, \dots, f_k < a_k\} \tag{4-1}$$

for all $(a_1, \dots, a_k) \in R^k$, can be represented in the form

$$F(a_1, \dots, a_k) = F_1(a_1)F_2(a_2) \cdots F_k(a_k) \tag{4-2}$$

for all $(a_1, \dots, a_k) \in R^k$, where $F_i$ is the distribution of
the variable $f_i$.

Thus, to simulate such a process it is enough to find the
computerizations $G_i = c(F_i)$ for $i = 1, \dots, k$. Then, if
$u_1, \dots, u_k$ are k independent random variables each having uniform
distribution on the interval (0,1), the variables $x_1 = G_1(u_1)$,
$x_2 = G_2(u_2)$, ..., $x_k = G_k(u_k)$ will have the distribution given by
the function F.
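In code, this amounts to feeding one independent uniform number into each computerization. The following Python sketch is a hedged illustration: `open_uniform` and `simulate_independent` are names introduced here, and the two computerizations in the usage lines are arbitrary placeholders rather than examples from the report.

```python
import random


def open_uniform(rng):
    """Uniform number in the open interval (0, 1); rng.random() may return 0.0."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def simulate_independent(Gs, rng):
    """Return [G_1(u_1), ..., G_k(u_k)] with independent uniform u_i."""
    return [G(open_uniform(rng)) for G in Gs]


# Placeholder computerizations: a uniform law on (0, 2) and a fair +/-1 coin.
Gs = [lambda u: 2.0 * u, lambda u: -1.0 if u < 0.5 else 1.0]
x = simulate_independent(Gs, random.Random(1))
print(x)
```

Independence of the simulated variables rests on the independence of successive generator outputs, which is an assumption about the generator rather than something the sketch can guarantee.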


4.1 PROBLEM 

Let $p = (\lambda, \phi)$ be a random point on the sphere

$$S = \{(x, y, z): x^2 + y^2 + z^2 = 1\} \tag{4-3}$$

where $\lambda$ and $\phi$ are its spherical coordinates. Simulate the
uniform distribution on the sphere S.

Solution. The probability density f at the point $p \in S$ is given by
the formula

$$f(p) = \frac{1}{4\pi} \quad \text{for } p \in S \tag{4-4}$$

Using spherical coordinates, we can write a representation of the set
S as

$$S = \{p = (\lambda, \phi): 0 < \lambda < 2\pi,\ -\pi/2 < \phi < \pi/2\} \tag{4-5}$$

where we have neglected sets of measure zero; i.e., the poles and one
meridian. The density function in these coordinates will have the form

$$f(\lambda, \phi) = f_1(\lambda) f_2(\phi), \quad (\lambda, \phi) \in S \tag{4-6}$$

where

$$f_1(\lambda) = \frac{1}{2\pi} \ \text{for all } \lambda \in (0, 2\pi), \qquad f_2(\phi) = \tfrac{1}{2}\cos\phi \ \text{for all } \phi \in (-\pi/2, \pi/2).$$
Thus, the joint distribution F of the variables $\lambda$, $\phi$ is
given by

$$F(a_1, a_2) = \int_{-\infty}^{a_1} f_1(\lambda)\,d\lambda \int_{-\infty}^{a_2} f_2(\phi)\,d\phi = F_1(a_1)F_2(a_2) \tag{4-7}$$

for all $(a_1, a_2) \in R^2$ (with $f_1$ and $f_2$ taken as zero outside
their intervals of definition), where $F_1$ is the distribution of
$\lambda$ and $F_2$ the distribution of $\phi$. Since the joint
distribution is a product of the two distributions, the random variables
$\lambda$, $\phi$ are independent. Moreover,

$$F_1(a_1) = \begin{cases} \dfrac{a_1}{2\pi} & \text{if } 0 < a_1 < 2\pi \\ 0 & \text{if } a_1 \le 0 \\ 1 & \text{if } a_1 \ge 2\pi \end{cases} \tag{4-8}$$

The distribution $F_2$ was discussed in example 3.2. Thus,
computerizations of the distributions $F_1$ and $F_2$ are given by

$$G_1(u_1) = 2\pi u_1 \tag{4-9}$$

$$G_2(u_2) = \arcsin(2u_2 - 1) \tag{4-10}$$

Hence, the variables $\lambda = 2\pi u_1$ and $\phi = \arcsin(2u_2 - 1)$,
where $u_1$, $u_2$ are independent uniform random variables on the interval
(0,1), will simulate a uniform distribution of points on the sphere S.
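Formulas (4-9) and (4-10) translate directly into a sampler. The Python sketch below is an illustration under stated assumptions: the function name is chosen here, and the conversion to Cartesian coordinates assumes that $\lambda$ is the longitude and $\phi$ the latitude, as the ranges in (4-5) suggest.

```python
import math
import random


def random_point_on_sphere(rng):
    """Return (x, y, z) on the unit sphere from two independent uniforms."""
    u1, u2 = rng.random(), rng.random()
    lam = 2.0 * math.pi * u1              # longitude, formula (4-9)
    phi = math.asin(2.0 * u2 - 1.0)       # latitude, formula (4-10)
    return (math.cos(phi) * math.cos(lam),
            math.cos(phi) * math.sin(lam),
            math.sin(phi))


rng = random.Random(12345)                # fixed seed, for reproducibility only
pts = [random_point_on_sphere(rng) for _ in range(20000)]
mean_z = sum(p[2] for p in pts) / len(pts)
print(mean_z)                             # expected to be near 0 for a uniform law
```

Note that z = sin $\phi$ = 2u_2 - 1 is itself uniform on (-1,1), which is the classical characterization of the uniform distribution on the sphere.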


5.0 INTEGRAL PROPERTY OF THE COMPUTERIZATION OPERATOR 


Let F be the distribution of a real random variable; i.e., F satisfies
conditions a, b, c of section 2. Let G = c(F) be the computerization
of the distribution F. Denote by V the prering (see ref. 2) consisting
of all intervals I of the form <a,b), $(-\infty, a)$, $\langle a, \infty)$,
where a and b are real numbers. Define a set function v on V by the formula

$$\begin{aligned} v\langle a,b) &= F(b) - F(a) \\ v(-\infty,b) &= F(b) - F(-\infty) = F(b) \\ v\langle a,\infty) &= F(\infty) - F(a) = 1 - F(a) \end{aligned} \tag{5-1}$$
We can prove that the set function v is countably additive on V
and thus forms a volume in the sense of Bogdanowicz (ref. 2). Following
the development of reference 2, denote by S(V, R) the collection of simple
functions; i.e., functions of the form

$$s(x) = r_1 c_{A_1}(x) + \dots + r_k c_{A_k}(x) \quad \text{for all } x \in R \tag{5-2}$$

where $A_1, \dots, A_k$ are disjoint sets from the prering V;
$r_1, \dots, r_k$ are real numbers; and $c_A$ denotes the characteristic
function of the set A. The set of simple functions is linear and the
following functionals are well defined on it:

$$\int s\,dv = r_1 v(A_1) + \dots + r_k v(A_k) \tag{5-3}$$

$$\|s\| = |r_1| v(A_1) + \dots + |r_k| v(A_k) \tag{5-4}$$

The first functional is linear and the second forms a seminorm on S(V, R).
Moreover, $|\int s\,dv| \le \|s\|$ for all simple functions.

Denote by N the collection of all sets A of R such that for every
$\varepsilon > 0$ there exists a countable family $A_t \in V$ ($t \in T$)
such that the set A is contained in the union $\bigcup_{t \in T} A_t$ and
$\sum_{t \in T} v(A_t) < \varepsilon$. Sets of this collection N will be
called v-null sets.

A sequence $s_n \in S(V, R)$ is called basic if there exists a sequence
$k_n$ of simple functions and a constant M such that
$s_n = k_1 + k_2 + \dots + k_n$ and $\|k_n\| \le M 4^{-n}$ for all n.
Denote by L(v, R) the set of all functions f for which there exists a basic
sequence $s_n$ and a null set $A \in N$ such that the sequence of values
$s_n(x)$ converges to the value f(x) if $x \notin A$.

Define $\|f\| = \lim \|s_n\|$, $\int f\,dv = \lim \int s_n\,dv$. According
to reference 2, these are well-defined functionals on L(v, R); and the
space L(v, R) coincides with the space of Lebesgue summable functions with
respect to the Lebesgue measure $\mu$, which is the smallest complete
measure extending the volume v (see ref. 5). Moreover, the two integrals,
$\int f\,dv$ and $\int f\,d\mu$, coincide.

In the sequel, we shall write $\int_a^b f(x)\,dF(x)$ to denote the
integral $\int c_{\langle a,b)} f\,dv$.

Notice that the classical Lebesgue integral is generated by the function
g(u) = u for all $u \in R$, which corresponds to the volume
$v(\langle a,b)) = b - a$ on the prering W of all bounded right side open
intervals.

We shall say that a function f is v-summable on a set A, or
equivalently that the integral $\int_A f\,dv$ exists, if and only if
$c_A f \in L(v, R)$.

5.1 THEOREM

Let F be a probability distribution over R and G its computerization.
Then, if the right-hand integral in the following formula exists in
Lebesgue's sense, so does the other and they are equal

$$\int_{-\infty}^{\infty} f(x)\,dF(x) = \int_0^1 f(G(u))\,du \tag{5-5}$$


5.2 REMARK 


The theorem is valid for the Riemann-Stieltjes integral when the
function F is continuous and invertible. Notice that in theorem 5.1 each
function, F and G, may have an infinite number of discontinuities and
neither has to be invertible.
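Formula (5-5) can be checked numerically for a distribution that is neither continuous nor invertible. The Python sketch below is an illustration, not part of the report: it uses the mixed distribution of example 3.1 and the test function f(x) = x^2, both chosen here for convenience. The left-hand side is computed by hand as 0.5(-1)^2 + 0.5 \int_0^\infty x^2 e^{-x} dx = 0.5 + 0.5(2) = 1.5, and the right-hand side is approximated by a midpoint rule on (0,1); the function names and the number of subintervals are assumptions.

```python
import math


def G(u):
    """Computerization (3-1) of the mixed distribution of example 3.1."""
    return -1.0 if u < 0.5 else -math.log(2.0 - 2.0 * u)


def rhs(f, n=100000):
    """Midpoint-rule approximation of the integral of f(G(u)) over (0, 1)."""
    h = 1.0 / n
    return h * sum(f(G((i + 0.5) * h)) for i in range(n))


LHS_EXACT = 1.5                      # integral of x^2 dF, computed by hand above
approx = rhs(lambda x: x * x)
print(approx, LHS_EXACT)
```

The integrand f(G(u)) has a logarithmic singularity at u = 1, so the midpoint rule converges slowly there; the agreement is approximate, not exact.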


5.3 REMARK 


Let v be the volume generated by the distribution F and $\mu$ the
classical Lebesgue measure over the interval (0,1). The above theorem is
equivalent to the following. The map $f \to f \circ G$ imbeds isometrically
the Lebesgue space L(v, R) into the space L($\mu$, R).


Proof of the theorem. Let $f = c_{(-\infty,a)}$. Then
$f \circ G = c_{(0,F(a))}$. Indeed, $f \circ G(u) = 1$ if and only if
$G(u) \in (-\infty,a)$; i.e., $G(u) < a$, which is equivalent, as proved in
section 2, to $u \in (0, F(a))$; i.e., to $c_{(0,F(a))}(u) = 1$. Thus, the
characteristic function of an interval <a,b) is mapped into the
characteristic function of the interval <F(a), F(b)). Indeed,

$$c_{\langle a,b)} \circ G = (c_{(-\infty,b)} - c_{(-\infty,a)}) \circ G = c_{(-\infty,b)} \circ G - c_{(-\infty,a)} \circ G = c_{(0,F(b))} - c_{(0,F(a))} = c_{\langle F(a),F(b))} \tag{5-6}$$

Similarly, the characteristic function of $(-\infty,b)$ is mapped into the
characteristic function of (0, F(b)) and the characteristic function of
$\langle a,\infty)$ into the characteristic function of <F(a), 1). By the
definition of the volume v on the prering V and of the Lebesgue measure
$\mu$, we get $v(I) = \mu(G^{-1}(I))$ for every $I \in V$, since
$c_I \circ G = c_A$ is equivalent to $A = G^{-1}(I)$. These observations
yield the equality

$$\int |s|\,dv = \int |s \circ G|\,d\mu \quad \text{for all } s \in S(V, R) \tag{5-7}$$

Let W be the collection of all left side closed subintervals of the
interval (0,1). It follows from the definition of a v-null set that if
A is a v-null set then the $\mu$ measure of the set $B = G^{-1}(A)$ is
zero. Notice that if $s_n \in S(V, R)$ is a basic sequence convergent for
all $x \notin A$ to the function f then the sequence $s_n \circ G$ belongs
to the set S(W, R) and converges for all points $u \notin B$ to the
function $f \circ G$. Thus, from the Bogdanowicz definition of the spaces
L(v, R) and L($\mu$, R), we get

$$\int |f|\,dv = \lim \int |s_n|\,dv = \lim \int |s_n \circ G|\,d\mu = \int |f \circ G|\,d\mu \tag{5-8}$$

for all functions $f \in L(v, R)$. This proves the theorem.


5.4 COROLLARY


For every bounded Borel function f and every $a \in R$ the following
equality holds:

$$\int_{-\infty}^{a} f(x)\,dF(x) = \int_0^{F(a)} f(G(u))\,du \tag{5-9}$$

Proof. It follows from the properties of v-measurable functions (ref. 3)
that every Borel measurable function is v-measurable. Since
$c_{(-\infty,a)}$ is a Borel function, the product $g = c_{(-\infty,a)} f$
also is a Borel function. Being v-measurable and bounded by the simple
function $M c_R = M c_{(-\infty,a)} + M c_{\langle a,\infty)}$ for some M,
the function g is v-summable. Since
$g \circ G = c_{(0,F(a))} (f \circ G)$, we get from theorem 5.1 the
equality

$$\int_{-\infty}^{a} f(x)\,dF(x) = \int_0^{F(a)} f(G(u))\,du \tag{5-9}$$
6.0 EXISTENCE OF TRANSITION PROBABILITIES


Let $f_1, f_2, \dots, f_m$ be real random variables over a probability
space. We shall prove that the following conditional distribution

$$P\{f_1 < a_1 \mid f_2 = a_2, f_3 = a_3, \dots, f_m = a_m\} \tag{6-1}$$

can be well defined as a Borel function of the vector
$a = (a_1, a_2, \dots, a_m)$ over the space $R^m$.

Let F denote the joint distribution of the variables $f_1, \dots, f_m$.
Each distribution function F obtained in this way is nondecreasing with
respect to the relation $a \le b$ on $R^m$ defined to mean $a_i \le b_i$
for all $i = 1, \dots, m$. This means that if $a \le b$ then
$F(a) \le F(b)$. Moreover, the distribution function is continuous with
respect to increasing convergence; i.e., the condition $a_j^n \uparrow a_j$
for $j = 1, \dots, m$ implies that
$F(a_1^n, a_2^n, \dots, a_m^n) \to F(a_1, a_2, \dots, a_m)$.

Finally, the function F is normalized; i.e.,
$F(-\infty, \dots, -\infty) = 0$, $F(\infty, \dots, \infty) = 1$.

According to Kolmogorov's theorem, these properties characterize a joint
distribution function F; i.e., for every such function there exists a unique
Borel probability measure P over $R^m$ such that

$$F(a_1, \dots, a_m) = P\{e_1 < a_1, \dots, e_m < a_m\} \tag{6-2}$$

for all $(a_1, \dots, a_m) \in R^m$, where $e_j$ are projection functions
defined by $e_j(a_1, \dots, a_m) = a_j$ for all
$(a_1, \dots, a_m) \in R^m$.

To make the presentation more general, it will be convenient to
introduce the following notation. If M and K are subsets of the set
{1, ..., k} and K is a proper subset of M, then we will write K < M.
Subsets of this form will be called indexes. The symbol |K| will denote
the number of elements of the set K.

We will denote by $R^T$ the space of all vectors $x = (x_t)_{t \in T}$,
where $x_t \in R$ denotes the component of the vector x with index t. If M
and K are two disjoint index sets and their union is $S = M \cup K$, then
the space $R^S$ can be identified with the product space $R^M \times R^K$
and every vector $x \in R^S$ can be written in the form $x = (x_M, x_K)$,
where $x_M \in R^M$ and $x_K \in R^K$.

If $a \in R^T$, we shall denote by I(a) the Cartesian product
$\times_{t \in T} (-\infty, a_t)$, where $a = (a_t)_{t \in T}$. Sets of
this form will be called in the sequel basic cones.

Now if $F_T$ is a probability distribution on $R^T$ and $p_T$ is the
corresponding Borel probability over $R^T$ obtained from Kolmogorov's
theorem (i.e., $F_T(a) = p_T(I(a))$ for all $a \in R^T$), for disjoint
decomposition of the index set T into nonempty sets S and U the Borel
probability $p_S$ is well defined by the formula

$$p_S(A) = p_T(R^U \times A) \tag{6-3}$$

for all Borel subsets A of the space $R^S$.

This probability in turn generates a probability distribution $F_S$. A
function $p_U^S$ will be called a transition probability from probability
$p_S$ to probability $p_T$ if its value $p_U^S(A, x)$ is defined for every
Borel set A being a subset of the space $R^U$ and every $x \in R^S$.
Moreover, the value $p_U^S(A, x)$ as a function of set A is a probability
measure for every $x \in R^S$ and as a function of point x is Borel for
every fixed Borel set A. Furthermore,

$$p_T(A \times B) = \int_B p_U^S(A, x)\,p_S(dx) \tag{6-4}$$

for every Borel subset A of $R^U$ and every Borel subset B of $R^S$ (see
ref. 10, p. 73).
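When all variables take finitely many values, relation (6-4) reduces to a finite sum and can be verified directly. The Python sketch below is an illustration only: the joint table `joint`, with keys of the form (x_U, x_S), and the helper names are hypothetical, and exact fractions are used so that the two sides can be compared with equality.

```python
from fractions import Fraction as Fr
from itertools import chain, combinations

# Hypothetical joint probability p_T on pairs (x_U, x_S).
joint = {(0, 0): Fr(1, 8), (1, 0): Fr(3, 8), (0, 1): Fr(1, 4), (1, 1): Fr(1, 4)}


def marginal_S(joint):
    """p_S(x_S) = p_T(R^U x {x_S}), the discrete form of (6-3)."""
    p_s = {}
    for (_, xs), p in joint.items():
        p_s[xs] = p_s.get(xs, 0) + p
    return p_s


def transition(joint, A, xs):
    """p_U^S(A, x_S): conditional probability of the set A given x_S."""
    p_s = marginal_S(joint)
    return sum(p for (xu, s), p in joint.items() if s == xs and xu in A) / p_s[xs]


def lhs(joint, A, B):
    """p_T(A x B)."""
    return sum(p for (xu, xs), p in joint.items() if xu in A and xs in B)


def rhs(joint, A, B):
    """Sum over B of p_U^S(A, x) p_S(x), the discrete form of the integral in (6-4)."""
    p_s = marginal_S(joint)
    return sum(transition(joint, A, xs) * p_s[xs] for xs in B if xs in p_s)


def subsets(values):
    return chain.from_iterable(combinations(values, r) for r in range(len(values) + 1))


ok = all(lhs(joint, set(A), set(B)) == rhs(joint, set(A), set(B))
         for A in subsets([0, 1]) for B in subsets([0, 1]))
print(ok)
```

In this finite setting every point of positive $p_S$-probability determines the transition probability uniquely; the general theory below deals with the case where that is no longer true.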

6.1 THEOREM


For every Borel probability $p_T$ over $R^T$, where T is finite, and every
generated Borel probability $p_S$, where S is a subset of T and the
difference set $U = \{t \in T: t \notin S\}$ is nonempty, there exists a
transition probability $p_U^S$ from the measure $p_S$ to the measure
$p_T$.
Proof. For every fixed Borel set A contained in $R^U$, let $q_A$ denote
the measure defined by the formula

$$q_A(B) = p_T(A \times B) \tag{6-5}$$

for all Borel sets B contained in $R^S$. Since

$$q_A(B) \le p_T(R^U \times B) = p_S(B) \tag{6-6}$$

for all Borel sets B in the space $R^S$, we get from the Radon-Nikodym
theorem (see refs. 6 and 10) that there exists a Borel function $f_A$
summable with respect to the measure $p_S$. After a modification on a Borel
set of $p_S$-measure zero, we get

$$0 \le f_A(x) \le 1 \quad \text{for all } x \in R^S \tag{6-7}$$

and

$$q_A(B) = \int_B f_A(x)\,p_S(dx) \tag{6-8}$$

for all Borel sets B contained in the space $R^S$.

If $b \in R^U$, let I(b) denote the Cartesian product
$\times_{t \in U} (-\infty, b_t)$. Every such set I(b) is a Borel set. A
vector $b \in R^U$ will be called rational if all its components are
rational. We shall write a < b for two such vectors if and only if
$a_t < b_t$ for all $t \in U$.

Let h be a function given by the formula

$$h(a, x) = \sup\{0, f_{I(b)}(x): b < a;\ b \text{ is rational}\} \tag{6-9}$$

for all $a \in R^U$ and all $x \in R^S$. The set following the supremum
operation is nonempty since it contains zero and is bounded. Thus, the
function h by the axiom of continuity is well defined and is a Borel
function in variable x.

Since both the measure and the integral are continuous under increasing
sequential convergence, we get from (6-8) and (6-5) the relation

$$p_T(I(a) \times B) = \int_B h(a, x)\,p_S(dx) \tag{6-10}$$
for all $a \in R^U$ and all Borel sets B of $R^S$. Moreover, from the
definition of function h we get that, for every fixed x, it is
nondecreasing; i.e., if for two vectors $a, c \in R^U$ we have $a \le c$,
then $h(a, x) \le h(c, x)$. It is also left side continuous; i.e., if
$a^n < a$ for all n and the vectors $a^n$ converge to the vector a, then
the sequence of values $h(a^n, x)$ converges to the value h(a, x).

Now from relation (6-10), using the monotone convergence theorem, we
get the following relations:

$$p_S(B) = \int_B h(\infty, x)\,p_S(dx) \tag{6-11}$$

$$0 = \int_B h(-\infty, x)\,p_S(dx) \tag{6-12}$$

for all Borel sets B in $R^S$, where $h(\infty, x)$ denotes the limit in
the variable a all of whose coordinates tend to $\infty$. The value
$h(-\infty, x)$ is understood similarly. Since

$$h(\infty, x) = \lim_n h(a^n, x) \quad \text{for all } x \in R^S \tag{6-13}$$

where $a^n$ is an increasing sequence of vectors such that each component
$a_t^n$ tends to infinity, we get that $h(\infty, x)$ as a function of x is
Borel measurable on the space $R^S$.

Thus, from the Radon-Nikodym theorem there exists a set C of
$p_S$-measure zero such that $h(\infty, x) = 1$ and $h(-\infty, x) = 0$ if
$x \notin C$.

Modifying h on this set by putting h(a, x) = g(a) for all $x \in C$,
where g is any probability distribution on $R^U$, we get that the value
h(a, x) as a function of x is a Borel function for every a and h(a, x)
as a function of a is a probability distribution for every fixed x. Thus,
by Kolmogorov's theorem h generates a unique probability measure
$p_U^S(A, x)$ defined on all Borel sets A of $R^U$ for every fixed point
$x \in R^S$.


Let us prove that for every fixed Borel set A of $R^U$ the following
two properties hold:

A. The function $p_U^S(A, x)$ as a function of x is Borel measurable
over the space $R^S$.

B. For every Borel set B of $R^S$, we have

$$p_T(A \times B) = \int_B p_U^S(A, x)\,p_S(dx).$$

To this end, denote by M the collection of all Borel sets A of $R^U$
for which properties A and B hold.

Observe that the sets I(a) belong to M. Since M is closed under
disjoint finite union, it is also closed under proper differences; i.e., if
$A_1$ is a subset of $A_2$ and both sets $A_1$, $A_2$ are in M, then also
the difference set $A = \{x \in A_2: x \notin A_1\}$ is in M. Thus, if V
denotes the prering consisting of all intervals of the form
$(-\infty, a)$ and <b,a), where a and b are real numbers, and W denotes the
prering consisting of all Cartesian products of the form
$A = \times_{t \in U} A_t$, where $A_t \in V$ for every $t \in U$, we can
prove by induction with respect to the number of bounded intervals $A_t$
appearing in the representation of the set A that the prering W is
contained in the collection M. Finite disjoint unions of sets from the
prering W form the smallest ring containing W.

Observe that M is closed under monotone convergence of sets. According
to a theorem of Halmos (ref. 8), this implies that M contains the smallest
sigma ring generated by W. We can prove that this sigma ring coincides
with the sigma ring of all Borel sets of the space $R^U$. This concludes
the proof of the theorem.


6.2 THEOREM

Let T, S, U be as before and $q_U^S$, $p_U^S$ be two transition
probabilities from probability $p_S$ to probability $p_T$.

There exists a Borel set C of $p_S$-measure zero such that

$$p_U^S(A, x) = q_U^S(A, x) \tag{6-14}$$

for all Borel sets A of $R^U$ and all $x \notin C$.

Proof. It follows from the definition of a transition probability and
from the Radon-Nikodym theorem that for every Borel set A of $R^U$ there
exists a Borel set C(A) of $p_S$-measure zero such that

$$p_U^S(A, x) = q_U^S(A, x) \tag{6-15}$$

for all $x \notin C(A)$. Let D denote the set of all rational points b of
the space $R^U$ and let C be the union of the sets C(I(b)) over all
$b \in D$. Clearly, C is a Borel set of $p_S$-measure zero.

Denote by M the collection of all Borel sets A of $R^U$ such that
$p_U^S(A, x) = q_U^S(A, x)$ if $x \notin C$. This collection contains
basic cones I(b), where b is a rational vector. It follows from the
monotone continuity of a measure that M contains every set I(a) for any
$a \in R^U$. The rest of the argument is the same as in theorem 6.1. This
concludes the proof.

Let a function $F_U^S$ be defined on the product $R^U \times R^S$ such
that for every fixed $x \in R^S$ the value $F_U^S(a, x)$ considered as a
function of the variable a is a probability distribution and for every
fixed $a \in R^U$ the value $F_U^S(a, x)$ considered as a function of x is
a Borel function. Such a function will be called in the sequel a
transition distribution.

6.3 THEOREM

There is a one-to-one correspondence between transition probabilities
$q_U^S$ and transition distributions $F_U^S$. This correspondence is given
by the relation

$$q_U^S(I(a), x) = F_U^S(a, x) \tag{6-16}$$

for all $a \in R^U$ and $x \in R^S$.

Proof. It is clear that every transition probability generates a
transition distribution by means of formula (6-16). To prove that every
transition distribution generates a unique transition probability, take an
arbitrary fixed point $x \in R^S$ and denote by $q_U^S(A, x)$ the value of
the probability measure defined for a Borel set A of $R^U$ and generated
from the probability distribution $F_U^S(a, x)$ by means of Kolmogorov's
construction.

To prove that $q_U^S$ is a transition probability, it is sufficient to
prove that for every fixed Borel set A the value $q_U^S(A, x)$ as a
function of $x \in R^S$ is Borel measurable. To this end, denote by M the
collection of all Borel sets A having that property. Notice that all basic
cones I(a) belong to M. Notice that M is closed under disjoint finite union
and under the monotone convergence of sets. Thus, as in the proof of
theorem 6.1, we conclude that M coincides with the sigma ring of all Borel
sets of $R^U$.

This concludes the proof of the theorem.


7.0 RESOLUTION OF BOREL PROBABILITIES 


Let T be a finite index set and U, S its disjoint decomposition
into nonempty sets. Let $p_T$ be a Borel probability over $R^T$ and $p_S$
the corresponding probability generated over $R^S$.

The theorems of the previous section show that the measure $p_T$
generates almost unique representation of $p_T$ by means of $q_U^S$ and
$p_S$ through the formula

$$p_T(A \times B) = \int_B q_U^S(A, x)\,p_S(dx) \tag{7-1}$$

for all Borel sets A of $R^U$ and B of $R^S$. Conversely, any pair
$q_U^S$, $p_S$ consisting of a transition probability and a probability on
Borel sets generates a unique Borel probability over the space $R^T$
(ref. 10, p. 74).

It follows from Kolmogorov's theorem that condition (7-1) is
equivalent to

$$p_T(I(a) \times I(b)) = \int_{I(b)} q_U^S(I(a), x)\,p_S(dx) \tag{7-2}$$

for all $a \in R^U$ and all $b \in R^S$.

The necessity of condition (7-2) is obvious. To prove its sufficiency,
fix the point $a \in R^U$ and consider two measures

$$r_1(B) = p_T(I(a) \times B) \tag{7-3}$$

$$r_2(B) = \int_B q_U^S(I(a), x)\,p_S(dx) \tag{7-4}$$

for all Borel sets B of $R^S$. Since these measures coincide for every
basic cone according to relation (7-2), they must coincide for $B = R^S$.
If $r_1(R^S) = 0$, then both measures are identically zero on all Borel
sets B of $R^S$. If $r = r_1(R^S) > 0$, then dividing the measures $r_1$
and $r_2$ by the value r we get two probability measures that coincide on
all basic cones I(b). Thus, these two probability measures have the same
distribution function. By Kolmogorov's uniqueness theorem, this implies

$$r_1(B)/r = r_2(B)/r \tag{7-5}$$

for all Borel sets B of $R^S$. Hence,

$$p_T(I(a) \times B) = \int_B q_U^S(I(a), x)\,p_S(dx) \tag{7-6}$$

for all Borel sets B of $R^S$ and all basic cones I(a), $a \in R^U$.

Holding the Borel set B fixed, by a similar argument to the preceding
one, we get relation (7-1) for all Borel sets A of $R^U$ and all Borel sets
B of $R^S$.

Now let us introduce the following notations for points in the
spaces $R^T$ and $R^S$. If $a \in R^T$, then by $a_S$, where S is a subset
of T, we shall denote a point in the space $R^S$ such that its component
having index $t \in S$ coincides with the component $a_t$ of a. Thus, we
have $a = a_T$. When S = {t}, we shall write $a_t$ instead of
$a_{\{t\}}$. If U and S are disjoint nonempty sets whose union is the set
T, then the vector $a_T$ can be identified with the pair $(a_U, a_S)$.
Notice also that $I(a_T) = I(a_U) \times I(a_S)$. Using this convention, we
can write relation (7-2) in the equivalent form

$$p_T(I(a_T)) = \int_{I(a_S)} q_U^S(I(a_U), x_S)\,p_S(dx_S) \tag{7-7}$$

for all $a \in R^T$.


It is convenient to introduce a shorthand notation and convention
similar to Einstein's convention in tensorial calculus. Namely, the
relation given by formula (7-1) we shall write as

$$p_T = q_U^S\,p_S \tag{7-8}$$

This will mean that whenever in such a formula a superscript index set
S coincides with the subscript index set, we have an integration over a
Borel set with respect to the variable $x_S$.




Notice that the operator E mapping a pair $(q_U^S, p_S)$ into the element
$q_U^S\,p_S$ preserves convex combinations in both variables. Thus, it is
natural to extend it by homogeneity and linearity onto the space of all
Borel transition measures (since we will not use this space in the
sequel, we leave it to the reader to give precise definition to this space)
and onto the space $B_S$ of all finite Borel measures over the space
$R^S$.


Now let us extend the definition of the transition measure $p_U^S$ to
include the case when either S or U is empty. It is clear that when
$U = \emptyset$ the transition measure $p_U^S$ is only a point function
and the relation $q_S = p_U^S\,p_S$ means that $p_U^S$ is the Radon-Nikodym
derivative of the measure $q_S$ with respect to the measure $p_S$. When
$S = \emptyset$, the transition measure $p_U^S$ does not depend on the
point and thus is a probability measure. Thus, we assume
$p_U^{\emptyset} = p_U$.

Finally, let $p_{\emptyset} = 1$. Then the relation $q_T = p_U^S\,p_S$
defines uniquely the element $q_T$ for any disjoint decomposition S, U of
the index set T.

If the index set S consists of a single point t, we shall write $q_U^t$
instead of $q_U^{\{t\}}$ and similarly for the set U. Let
T = {1, 2, ..., n}. The notation T(j) = {k: k < j} will be used for
j = 1, 2, ..., n+1. Notice that T(1) denotes the empty set. A sequence of
transition probabilities

$$q_j^{T(j)} \quad \text{for } j = 1, 2, \dots, n$$

will be called a resolution of the probability $p_T$ if and only if

$$p_{T(j+1)} = q_j^{T(j)}\,p_{T(j)}, \quad j = 1, 2, \dots, n \tag{7-9}$$
It follows from the theorems in the previous section that such transition
probabilities exist and are almost unique; i.e., $q_j^{T(j)}$ is unique up to
a set of $p_{T(j)}$-measure zero.

Notice that the value of the transition probability $q_j^{T(j)}(A, x)$ gives
precise meaning to the conditional probability

$$ P\{f_j \in A \mid f_k = x_k \text{ for } k = 1, 2, \ldots, j-1\} $$   (7-10)

if the probability measure $p_T$ is generated by the joint distribution of
the functions $f_t$ $(t \in T)$. Since by theorem 6.1 there exists a
transition probability $p_{S(j)}^1$ from the measure $p_{T(j)}$ to the measure
$p_1$, we can write

$$ p_{T(j)} = p_{S(j)}^1\, p_1 $$   (7-11)

for j = 1, ..., n, where S(j) = {2, 3, ..., j-1}.

7.1 THEOREM

If the sequence $q_j^{T(j)}$ (j = 1, ..., n) represents a resolution of the
probability $p_T$, then there exists a Borel set C of $p_1$-measure zero such
that for every fixed value $x_1 \notin C$ the sequence

$$ q_j^{T(j)}(A, x_{S(j)}, x_1) $$

for j = 2, ..., n, as a function of A and the remaining variables
$x_t$ $(t \ne 1)$, represents a resolution of the Borel probability $P_S$
defined by the formula

$$ P_S(A) = p_S^1(A, x_1) $$   (7-12)

for all Borel sets A of $R^S$, where S = {2, 3, ..., n}.
Proof. From the definition of the resolution of the probability $p_T$,
we have

$$ p_{T(j+1)} = q_j^{T(j)}\, p_{T(j)} \quad \text{for } j = 1, 2, \ldots, n $$   (7-13)

It follows from the reduction formula for integrals with respect to
a probability measure generated by a transition probability (see Neveu,
ref. 10, p. 74) that

$$ \begin{aligned}
p_{T(j+1)}(I(a_{T(j+1)}))
&= \int_{I(a_{T(j)})} q_j^{T(j)}(I(a_j), x_{T(j)})\, p_{T(j)}(dx_{T(j)}) \\
&= \int_{I(a_1)} \int_{I(a_{S(j)})} q_j^{T(j)}(I(a_j), x_{S(j)}, x_1)\, p_{S(j)}^1(dx_{S(j)}, x_1)\, p_1(dx_1) \\
&= \int_{I(a_1)} r_{S(j+1)}^1(I(a_{S(j+1)}), x_1)\, p_1(dx_1)
\end{aligned} $$   (7-14)

where $r_{S(j+1)}^1$ denotes the transition probability satisfying the
condition

$$ r_{S(j+1)}^1(I(a_{S(j+1)}), x_1) = \int_{I(a_{S(j)})} q_j^{T(j)}(I(a_j), x_{S(j)}, x_1)\, p_{S(j)}^1(dx_{S(j)}, x_1) $$   (7-15)

This expression as a function of $x_1$ (according to ref. 10, p. 74)
represents a Borel measurable function and as a function of the variable
$a_{S(j+1)}$ represents a probability distribution for every fixed $x_1$.
Hence, by theorem 6.3, it determines a unique transition probability
$r_{S(j+1)}^1$. In this way from formula (7-14) we obtain the representation

$$ p_{T(j+1)} = r_{S(j+1)}^1\, p_1 $$   (7-16)
It follows from the uniqueness of the transition probability from
$p_{T(j+1)}$ to $p_1$, theorem 6.2, that for some Borel set C of
$p_1$-measure zero we have

$$ r_{S(j+1)}^1(I(a), x_1) = p_{S(j+1)}^1(I(a), x_1) $$   (7-17)

for all $a \in R^{S(j+1)}$ and all $x_1 \notin C$ and j = 2, ..., n.

Equalities (7-15) and (7-17) yield

$$ p_{S(j+1)}^1(I(a_{S(j+1)}), x_1) = \int_{I(a_{S(j)})} q_j^{T(j)}(I(a_j), x_{S(j)}, x_1)\, p_{S(j)}^1(dx_{S(j)}, x_1) $$   (7-18)

For a fixed $x_1 \notin C$, define the sequence of probabilities by

$$ P_{S(j)}(A) = p_{S(j)}^1(A, x_1) $$   (7-19)

for all Borel sets A of $R^{S(j)}$ and j = 2, ..., n, and define a sequence
of transition probabilities by the formula

$$ Q_j^{S(j)}(A, x_{S(j)}) = q_j^{T(j)}(A, x_{S(j)}, x_1) $$   (7-20)

for all Borel sets A and j = 2, ..., n.

We can prove from relation (7-18) that

$$ P_{S(j+1)}(A) = (Q_j^{S(j)}\, P_{S(j)})(A) $$   (7-21)

for all Borel sets A of $R^{S(j+1)}$ and j = 2, 3, ..., n. Relations (7-18),
(7-19), (7-20), and (7-21) prove that the sequence $Q_j^{S(j)}$ represents a
resolution of the probability $P_{S(n+1)} = P_S$. This completes the proof of
the theorem.
... $f_t$ $(t \in T)$ over ...


8.1 THEOREM

Every conditional distribution $F_t^S$, where $t \notin S$, is a Borel
function over the space $R^U$.

Proof. Let $Z = \{b_1, b_2, \ldots\}$ be the set of all rational points and
$Z_n = \{b_1, \ldots, b_n\}$. For every fixed $b \in Z$, the value
$F_t^S(b, a_S)$ as a function of $a_S$ on $R^S$ is Borel, as follows from its
definition. Define the function $H_n$ by the formula

$$ H_n(a_t, a_S) = \sup\{0,\ F_t^S(b, a_S) : b \in Z_n,\ b < a_t\} $$   (8-3)

for all $a_t \in R$ and $a_S \in R^S$.
These functions are well defined and the domain of the variable $a_t \in R$
can be split into a finite number of disjoint intervals $I_j$ (j = 0, 1, ..., n)
by means of the points of the set $Z_n$ so that for each of these intervals
the function $H_n(a_t, a_S)$ does not depend on $a_t$ and is Borel in the
variable $a_S \in R^S$. This follows from the fact that Borel functions are
closed under the finite supremum operation, and that on the interval $I_j$
only a finite number of elements of the set $Z_n$ is smaller than $a_t$.
Thus, every function $H_n$ can be represented in the form

$$ H_n(a_t, a_S) = \sum_j c_j(a_t)\, G_{jn}(a_S) $$   (8-4)

where $G_{jn}$ are Borel functions on $R^S$ and $c_j$ is the characteristic
function of the interval $I_j$. Since the characteristic function of an
interval is a Borel function and the composition of a Borel function with a
continuous function is a Borel function, we may consider the value
$c_j(a_t) = c_j \circ e_t(a_U)$ as a function of $a_U$, where $e_t$ is the
projection function defined by $e_t(a_U) = a_t$ for all $a_U \in R^U$, as a
Borel function over $R^U$. Similarly, we may consider the function
$G_{jn}(a_S) = G_{jn} \circ e_S(a_U)$ as a Borel function over $R^U$.

Since Borel functions are closed under multiplication and addition, the
functions $H_n$ are Borel over $R^U$. Now notice that

$$ F_t^S(a_t, a_S) = \lim_n H_n(a_t, a_S) \quad \text{for all } a \in R^U $$   (8-5)

Thus, the function $F_t^S$ is a Borel function over the space $R^U$. This
completes the proof.

Notice that every probability distribution $F_S$ on $R^S$ generates a
unique volume v on the product prering consisting of all sets of the
form $A = \times_{t \in S} A_t$ where $A_t \in V$ for all $t \in S$, and V
consists of all intervals of the form $(-\infty, a)$ and $\langle a, b)$ (see
ref. 5). The Bogdanowicz integral with respect to the volume v coincides
with the Lebesgue integral generated by the probability measure $p_S$.
Thus, the integral $\int f(x_S)\, v(dx_S)$ is uniquely determined if the
distribution function $F_S$ is known. By an integral with respect to the
distribution $F_S$, we shall understand

$$ \int f(x_S)\, F_S(dx_S) = \int f(x)\, v(dx) $$   (8-6)

Consequently, each function $F_j^{T(j)}$ from the sequence representing the
resolution of the distribution $F_T$ is uniquely determined almost everywhere
with respect to the measure generated by the distribution $F_{T(j)}$ and we
have

$$ F_{T(j+1)}(a_{T(j+1)}) = \int_{I(a_{T(j)})} F_j^{T(j)}(a_j, x_{T(j)})\, F_{T(j)}(dx_{T(j)}) $$   (8-7)

for all $a_{T(j+1)} \in R^{T(j+1)}$, j = 1, ..., n. These formulas can be used
to find the resolution. Again we may write for the sake of brevity
$F_{T(j+1)} = F_j^{T(j)} F_{T(j)}$, as in the case of transition probabilities.

9.0 SIMULATION THEOREM 

Let N = {1, 2, ...} and N(j) = {k ∈ N: k < j}. Let $F_{N(n+1)}$ be a
probability distribution over $R^n$ and let $F_j^{N(j)}$ (j = 1, ..., n) be a
resolution of the distribution $F_{N(n+1)}$. If $f_1, f_2, \ldots, f_n$ are
random variables over some probability space whose joint distribution is
$F_{N(n+1)}$, then $F_j^{N(j)}$ is a Borel function making meaningful the
conditional distribution

$$ P\{f_j < a_j \mid f_k = a_k \text{ for all } k \in N(j)\} = F_j^{N(j)}(a_1, \ldots, a_j) $$   (9-1)

for all $(a_1, \ldots, a_j) \in R^j$, j = 1, ..., n.
If G is a function of j variables $a_1, \ldots, a_j$ being a probability
distribution with respect to the variable $a_k$, denote by $c_k$ the
computerization operator acting on the k-th variable; i.e., $H = c_k(G)$ is
defined by

$$ H(a_1, \ldots, a_{k-1}, u_k, a_{k+1}, \ldots, a_j) = \inf\{a_k : u_k < G(a_1, \ldots, a_k, \ldots, a_j)\} $$   (9-2)

for all $a_1, \ldots, a_{k-1}, a_{k+1}, \ldots, a_j \in R$ and $u_k \in (0, 1)$.

The sequence $H_j$ (j = 1, ..., n) of functions defined by
$H_j = c_j(F_j^{N(j)})$ for j = 1, ..., n will be called a computerization of
the distribution $F_{N(n+1)}$.
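The computerization operator can be approximated numerically when no closed
form is available. The following is a minimal sketch, assuming that formula
(9-2) reads H(u) = inf{a : u < G(a)} for a fixed choice of the other
variables, and that the infimum lies in a known bracket [lo, hi]. The names
`computerize`, `lo`, `hi`, and `iters` are illustrative and do not come from
the paper.

```python
def computerize(F, lo, hi, iters=100):
    """Return H with H(u) ~ inf{a : u < F(a)} for a nondecreasing F.

    Assumes F(lo) <= u < F(hi) for every u that will be passed to H.
    """
    def H(u):
        left, right = lo, hi  # invariant: F(left) <= u < F(right)
        for _ in range(iters):
            mid = 0.5 * (left + right)
            if u < F(mid):
                right = mid   # mid belongs to the set {a : u < F(a)}
            else:
                left = mid    # mid lies below the infimum
        return right
    return H


def mixed_cdf(a):
    """Left-continuous distribution with an atom at -2 and a uniform part on (0, 1)."""
    if a <= -2.0:
        return 0.0
    if a <= 0.0:
        return 0.5
    if a <= 1.0:
        return 0.5 * (1.0 + a)
    return 1.0


H = computerize(mixed_cdf, -10.0, 10.0)
```

Bisection needs only monotonicity of F, so the same routine handles jumps
(the atom at -2) and continuous pieces without special cases.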

9.1 LEMMA

Each function $H_j$ for j = 1, 2, 3, ..., n is a Borel function.

Proof. For j = 1, the proof is obvious since $H_1$ is monotone. Take
any j > 1. First, let us prove that $H_j$ is Borel in the variables
$a_1, \ldots, a_{j-1}$. To this end take any number $a \in R$. Let G denote
the function defined by

$$ G(a_1, \ldots, a_{j-1}) = F_j^{N(j)}(a_1, \ldots, a_{j-1}, a) $$   (9-3)

for all $a_1, \ldots, a_{j-1} \in R$. From the equality of the sets

$$ \{(a_1, \ldots, a_{j-1}) : H_j(a_1, \ldots, a_{j-1}, u) < a\} = \{(a_1, \ldots, a_{j-1}) : u < G(a_1, \ldots, a_{j-1})\} $$   (9-4)

and the fact that the function G is Borel measurable, it follows that the
function $H_j$ is Borel measurable in the first j-1 variables when the jth
variable u is fixed.

Since in the variable u the function $H_j$ is monotone and right side
continuous, we may conclude that $H_j$ is Borel with respect to all its
variables jointly, as we concluded in the proof of Borel measurability of a
conditional distribution in theorem 8.1.


9.2 THEOREM 


Let $H_j$ (j = 1, ..., n) be a computerization of the distribution
$F_{N(n+1)}$. Define recursively the variables

$$ \begin{aligned}
x_1 &= H_1(u_1) \\
x_2 &= H_2(x_1, u_2) \\
x_3 &= H_3(x_1, x_2, u_3) \\
&\ \,\vdots \\
x_n &= H_n(x_1, \ldots, x_{n-1}, u_n)
\end{aligned} $$   (9-5)

where $u_1, u_2, \ldots, u_n$ are independent random variables with uniform
distribution over the open interval (0,1). Then the joint probability
distribution of the variables $x_1, \ldots, x_n$ coincides with the
distribution $F_{N(n+1)}$.
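The recursion (9-5) translates almost directly into a program. The sketch
below assumes the computerization is supplied as a list of Python callables;
the function name `simulate` and the `rng` parameter are illustrative, not
part of the paper.

```python
import random


def simulate(H, rng=random):
    """Draw one sample (x_1, ..., x_n) by the recursion (9-5).

    H is a list of callables; H[j] takes (x_1, ..., x_j, u) and returns x_{j+1}.
    rng is any object with a random() method returning values in [0, 1).
    """
    x = []
    for Hj in H:
        u = rng.random()
        while u == 0.0:        # keep u inside the open interval (0, 1)
            u = rng.random()
        x.append(Hj(*x, u))    # x_j depends on all earlier x's and a fresh u_j
    return tuple(x)
```

For instance, `simulate([H1, H2])` with any two-variable computerization
`(H1, H2)` returns one simulated pair; repeated calls give independent
samples as long as `rng` produces independent uniform numbers.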

Proof. Since each function $H_j$ is Borel, we can prove by induction
that each variable $x_j$ as a function of the variables $u_j$ is also Borel.
Thus, $x_j$ as functions of the variables $u_1, u_2, \ldots, u_n$ are Lebesgue
measurable. To find their joint distribution, we have to compute the
Lebesgue measure of the set

$$ D(a) = \{(u_1, \ldots, u_n) \in I^n : x_j < a_j \text{ for } j = 1, \ldots, n\} $$   (9-6)

where I = (0,1). From the definition of the variables $x_j$ and the
properties of the computerization operator, we get the identity

$$ D(a) = \{u \in I^n : u_j < F_j^{N(j)}(x_1, \ldots, x_{j-1}, a_j) \text{ for } j = 1, \ldots, n\} $$   (9-7)
We will prove the theorem by induction with respect to n. For
n = 1, we have

$$ D(a) = \{u \in I : u < F_1^{N(1)}(a)\} = (0, F_1^{N(1)}(a)) $$   (9-8)

and the Lebesgue measure of this set is
$\mu(D(a)) = F_1^{N(1)}(a) = F_{N(2)}(a)$ for all $a \in R$. Assume that the
theorem holds for n = k - 1. Notice that the set D(a) can be represented in
the form

$$ D(a_1, \ldots, a_n) = \{u \in I^n : u_1 < F_1^{N(1)}(a_1),\ (u_2, \ldots, u_n) \in D_{x_1}(a_2, \ldots, a_n)\} $$

where

$$ D_{x_1}(a_2, \ldots, a_n) = \{(u_2, \ldots, u_n) \in I^{n-1} : u_j < F_j^{N(j)}(x_1, x_2, \ldots, x_{j-1}, a_j) \text{ for } j = 2, 3, \ldots, n\} $$

Notice that for almost all $x_1 = H_1(u_1)$ with respect to the measure
generated by the distribution $F_1^{N(1)}$ the functions $F_j^{N(j)}$ as
functions of the remaining variables form a resolution of the distribution
$F_{S(n+1)}^1$ in which the value of the first variable is fixed to be $x_1$.
From Fubini's theorem, we get

$$ \begin{aligned}
\mu(D(a)) &= \int_0^{F_1^{N(1)}(a_1)} \int_{D_{x_1}(a_2, \ldots, a_n)} du_2 \cdots du_n\, du_1 \\
&= \int_0^{F_1^{N(1)}(a_1)} F_{S(n+1)}^1(x_1, a_2, \ldots, a_n)\, du_1
\end{aligned} $$   (9-10)
From corollary 5.4, we get

$$ \mu(D(a)) = \int_{I(a_1)} F_{S(n+1)}^1(x_1, a_2, \ldots, a_n)\, F_1^{N(1)}(dx_1) $$   (9-11)

Finally, from the properties of a conditional distribution, we get
$\mu(D(a)) = F_{N(n+1)}(a_1, \ldots, a_n)$ for all $a \in R^n$. This proves the
theorem.


10.0 EXAMPLES OF SIMULATION 


Let p be the Borel measure generated by means of Kolmogorov's
construction over $R^n$ from a joint probability distribution F of real
random variables $f_1, f_2, \ldots, f_n$.

Such a sequence of random variables we shall call a random process, and
the smallest closed set S in $R^n$ whose complement has p-measure zero, we
shall call the spectrum of the process. That the spectrum is well defined
follows from the fact that every open set in $R^n$ is the union of a
countable family of open spheres having rational centers and rational radii.
Thus, the union of all open sets of measure zero is a set of measure zero.
The complement of that set is the spectrum. Clearly, to define a random
process, it is sufficient to define the probability measure p over the
spectrum of the process.

10.1 PROBLEM 

Given is a steady flow of elementary particles through a region S in
the form of a unit disk as in the following sketch.

The intensity of the current i per unit of area at a point (x, y) is
given by the formula

$$ i(x, y) = 1 + x $$   (10-1)

Consider as a random process the polar coordinates $(X, \Phi)$ of the point
at which a particle arrives. Find a computerization of the process.


Solution. The spectrum of the process is the closed circle S. The
probability density f that a particle arrives at a point with coordinates
(x, y) is proportional to the intensity of the current i at the point;
i.e., f(x, y) = c i(x, y) for all (x, y) in S. This yields the equation

$$ 1 = \int_S f(x, y)\,dx\,dy = c \int_S i(x, y)\,dx\,dy = c\pi $$   (10-2)

which yields $c = 1/\pi$.

In polar coordinates, the set S has a representation

$$ S = \{(r, \phi) : 0 < r < 1,\ 0 < \phi < 2\pi\} $$   (10-3)

neglecting in S several lines that have measure zero. Since
$dx\,dy = r\,dr\,d\phi$, the probability density g in polar coordinates is
given by

$$ g(r, \phi) = \frac{1}{\pi}\, r(1 + r\cos\phi) $$   (10-4)

Thus, computing the distribution on the spectrum S, we get

$$ F(a_1, a_2) = \frac{1}{2\pi}\, a_2 a_1^2 + \frac{1}{3\pi}\, a_1^3 \sin a_2 $$   (10-5)
for $0 < a_1 < 1$ and $0 < a_2 < 2\pi$. This yields the distribution
$F_1^{N(1)}(a_1) = a_1^2$ for all $a_1 \in (0, 1)$. To find the conditional
distribution $F_2^1$, notice that the equation

$$ F(a_1, a_2) = \int_{I(a_1)} F_2^1(x_1, a_2)\, F_1^{N(1)}(dx_1) $$   (10-6)

is equivalent to

$$ \int_0^{a_1} \int_0^{a_2} g(r, \phi)\, d\phi\, dr = \int_0^{a_1} F_2^1(x_1, a_2)\, 2x_1\, dx_1 $$   (10-7)

Differentiating this identity with respect to $a_1$, we get the equation

$$ \int_0^{a_2} g(a_1, \phi)\, d\phi = F_2^1(a_1, a_2)\, 2a_1 $$   (10-8)

Thus,

$$ F_2^1(a_1, a_2) = \frac{1}{2a_1} \int_0^{a_2} g(a_1, \phi)\, d\phi = \frac{1}{2\pi}\,(a_2 + a_1 \sin a_2) $$   (10-9)

To find the computerization of the process $(X, \Phi)$, notice that from
the continuity of the distribution $F_1^{N(1)}$ on the interval
$\langle 0, 1 \rangle$ we get

$$ u_1 = a_1^2, \quad \text{or} \quad a_1 = (u_1)^{1/2} $$   (10-10)

This yields

$$ H_1(u_1) = (u_1)^{1/2} \quad \text{for all } u_1 \in (0, 1) $$   (10-11)
Similarly, by continuity of $F_2^1(a_1, a_2)$ in the second variable
$a_2 \in \langle 0, 2\pi \rangle$, we get the equation

$$ u_2 = \frac{1}{2\pi}\,(a_2 + a_1 \sin a_2) $$   (10-12)

This equation with respect to the variable $a_2$ is the Kepler equation. It
can be solved either by iteration or by Newton's method.

Both methods are easily programmable on a computer. Let $a_2 = H_2(a_1, u_2)$
be the solution of the equation as a function of $a_1 \in (0, 1)$ and
$u_2 \in (0, 1)$. The pair $(H_1, H_2)$ represents a computerization of the
process; that is, the pair of variables $X = H_1(u_1)$, $\Phi = H_2(X, u_2)$,
where $u_1$, $u_2$ are independent random variables with uniform distribution
over the open unit interval (0,1), will simulate the process.
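The computerization of Problem 10.1 can be sketched in a few lines of
Python. The text suggests iteration or Newton's method for the Kepler
equation; this sketch substitutes plain bisection, which is slower but needs
no safeguards because $a_2 + a_1 \sin a_2$ is strictly increasing in $a_2$
when $0 \le a_1 < 1$. The names `sample_polar` and `iters` are illustrative.

```python
import math
import random


def H1(u1):
    """Radius: inverse of F(a1) = a1**2, formula (10-11)."""
    return math.sqrt(u1)


def H2(a1, u2, iters=80):
    """Angle: solve u2 = (a2 + a1*sin(a2)) / (2*pi) for a2 in [0, 2*pi] by bisection."""
    target = 2.0 * math.pi * u2
    lo, hi = 0.0, 2.0 * math.pi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + a1 * math.sin(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sample_polar(rng=random):
    """Return one simulated arrival point (r, phi) in polar coordinates."""
    r = H1(rng.random())
    phi = H2(r, rng.random())
    return r, phi
```

A quick sanity check is that the sample mean of `r * math.cos(phi)` should
approach 1/4, since the density $(1 + x)/\pi$ on the unit disk gives
$E(x) = 1/4$.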


10.2 PROBLEM 


Consider a chemical process generating ions. Assume that the random
process consists of two variables (x, y), where x is the energy level of
an ion and y is its life expectancy. Assume that the density of the
probability distribution of the process is given by the formula

$$ f(x, y) = h(y)\, e^{-2y}\, \delta(x + 2) + \frac{1}{2}\,(h(x) - h(x - 1))\, h(y)\, x e^{-xy} $$   (10-13)

for all $(x, y) \in R^2$. Find a computerization of the process. (Notice that
$\delta$ and h denote here, respectively, Dirac's delta function with mass
centered at zero and its distribution function over R.)
Solution. Notice that the spectrum of the process is given by the
following diagram.

It consists of an infinite ray at the position x = -2 and an infinite strip
above the x-axis such that $0 \le x \le 1$.

Computing the distribution function F, we get

$$ F(a_1, a_2) = \begin{cases}
0 & \text{if } a_2 \le 0 \\
0 & \text{if } a_2 > 0 \text{ and } a_1 \le -2 \\
\tfrac{1}{2}(1 - e^{-2a_2}) & \text{if } a_2 > 0 \text{ and } -2 < a_1 \le 0 \\
\tfrac{1}{2}(1 - e^{-2a_2}) + \tfrac{1}{2}\left(a_1 + a_2^{-1}(e^{-a_1 a_2} - 1)\right) & \text{if } a_2 > 0 \text{ and } 0 < a_1 \le 1 \\
\tfrac{1}{2}(1 - e^{-2a_2}) + \tfrac{1}{2}\left(1 + a_2^{-1}(e^{-a_2} - 1)\right) & \text{if } a_2 > 0 \text{ and } a_1 > 1
\end{cases} $$   (10-14)
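The two building blocks of (10-14) can be checked by direct integration.
The following worked step assumes that the strip part of the density (10-13)
is $\tfrac{1}{2}\, x e^{-xy}$ for $0 < x < 1$, $y > 0$, which is the form
implied by (10-14) itself. For the spectral line at x = -2 and $a_2 > 0$,

$$ \int_0^{a_2} e^{-2y}\, dy = \frac{1}{2}\,(1 - e^{-2a_2}) $$

and for the strip, with $0 < a_1 \le 1$ and $a_2 > 0$,

$$ \frac{1}{2} \int_0^{a_1} \int_0^{a_2} x e^{-xy}\, dy\, dx = \frac{1}{2} \int_0^{a_1} (1 - e^{-x a_2})\, dx = \frac{1}{2}\left(a_1 + a_2^{-1}(e^{-a_1 a_2} - 1)\right) $$

Setting $a_1 = 1$ in the second expression gives the strip term for $a_1 > 1$.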
This set of equations yields the formulas

$$ F_1^{N(1)}(a_1) = \begin{cases}
0 & \text{if } a_1 \le -2 \\
\tfrac{1}{2} & \text{if } -2 < a_1 \le 0 \\
\tfrac{1}{2}(1 + a_1) & \text{if } 0 < a_1 \le 1 \\
1 & \text{if } a_1 > 1
\end{cases} $$   (10-15)

The spectrum of the variable x consists of the point x = -2 and the
interval $\langle 0, 1 \rangle$. At the point x = -2, the measure generated by
the distribution $F_1^{N(1)}$ has mass 1/2, and on the interval
$\langle 0, 1 \rangle$ it has a linear mass density equal to 1/2. Using these
properties, we get for the conditional distribution $F_2^1$ the values

$$ F_2^1(a_1, a_2) = \begin{cases}
1 - e^{-2a_2} & \text{if } a_1 = -2 \\
1 - e^{-a_1 a_2} & \text{if } 0 < a_1 < 1
\end{cases} $$   (10-16)


Since outside the spectrum of the variable x the function $F_2^1$ may be
defined arbitrarily, let us set

$$ F_2^1(a_1, a_2) = \begin{cases}
1 - e^{-2a_2} & \text{if } a_1 \le 0 \\
1 - e^{-a_1 a_2} & \text{if } a_1 > 0
\end{cases} $$   (10-17)
From the graph of the function $F_1^{N(1)}$, we get the formula

$$ H_1(u_1) = \begin{cases}
-2 & \text{if } 0 < u_1 < 1/2 \\
2u_1 - 1 & \text{if } 1/2 \le u_1 < 1
\end{cases} $$   (10-18)

Since the function $F_2^1(a_1, a_2)$ is continuous in the variable $a_2$ for
a fixed value of the variable $a_1$ and invertible on the interval
$(0, \infty)$, we get

$$ H_2(a_1, u_2) = \begin{cases}
-(1/2)\log(1 - u_2) & \text{if } a_1 \le 0 \\
-(1/a_1)\log(1 - u_2) & \text{if } a_1 > 0
\end{cases} $$   (10-19)


The pair of functions (H]^, H 2 ) represents a computerization of the 
process x, y. In the above example, we considered for the sake of simplic- 
ity a process having only one spectral line and one continuous areal compo- 
nent. The method used here can be easily extended to the case where the 
spectrum consists of a sequence of spectral lines and several two-dimensional 
components. 
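Formulas (10-18) and (10-19) give the computerization in closed form, so the
simulation needs no equation solver. The sketch below is one possible
transcription into Python; the name `sample_ion` and the `rng` parameter are
illustrative.

```python
import math
import random


def H1(u1):
    """Energy level, formula (10-18): atom of mass 1/2 at x = -2, uniform part on (0, 1)."""
    return -2.0 if u1 < 0.5 else 2.0 * u1 - 1.0


def H2(a1, u2):
    """Lifetime, formula (10-19): exponential with rate 2 on the line, rate a1 on the strip."""
    rate = 2.0 if a1 <= 0.0 else a1
    return -math.log(1.0 - u2) / rate


def sample_ion(rng=random):
    """Return one simulated pair (x, y)."""
    x = H1(rng.random())
    y = H2(x, rng.random())
    return x, y
```

About half of the simulated points fall on the spectral line x = -2 and the
rest are spread over the strip 0 < x < 1, with lifetimes that grow longer as
x approaches zero.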


11.0 EXPECTED VALUE OF A FUNCTION OF A PROCESS 


Many applications require computation of some parameters of a process,
such as covariance matrix, moments, and characteristic function. These
computations require one to find the expectation

$$ E(f(x_1, x_2, \ldots, x_k)) $$   (11-1)

where f is a sufficiently regular Borel function defined on the range
(spectrum) of the process $x_1, x_2, \ldots, x_k$.

Computer simulation allows one to find the approximate values of the
expectation and to establish probabilistic bounds on the error of the
expected value. This can be done by invoking the central limit theorem if
the function f has a finite second moment. Treating the value

$$ y = f(x_1, x_2, \ldots, x_k) $$   (11-2)
as a random variable, one may find by simulation of the process sufficiently
large stochastically independent samples of the variable y. The mean of the
sample will approximate the expectation E(y). Since the mean for large
samples has approximately normal distribution, from the sample of the
variable y one can easily estimate the variance of the mean, and thus get an
idea of the accuracy of the estimate of the expected value.
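The procedure just described can be sketched as follows. The function name
`estimate_expectation` and its parameters are illustrative; `sample` stands
for any routine that returns one simulated tuple of the process, and f is
assumed to have a finite second moment so that the standard error is
meaningful.

```python
import math
import random


def estimate_expectation(sample, f, n, rng=random):
    """Return (mean, standard_error) of f over n independent simulated samples.

    sample(rng) must return a tuple (x_1, ..., x_k); n must be at least 2.
    """
    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        y = f(*sample(rng))
        total += y
        total_sq += y * y
    mean = total / n
    # unbiased sample variance of y, guarded against tiny negative round-off
    variance = max(total_sq / n - mean * mean, 0.0) * n / (n - 1)
    return mean, math.sqrt(variance / n)
```

By the central limit theorem, the interval mean ± 2 standard errors covers
E(y) with probability of roughly 95 percent for large n.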


12.0 INTEGRAL FORMULA FOR EXPECTATION

Let F be a probability distribution over $R^n$ and $H_1, H_2, \ldots, H_n$ its
computerization. Define a map G from the cube $I^n$, where I = (0,1), into
$R^n$ by the formula x = G(u), where

$$ \begin{aligned}
x_1 &= H_1(u_1) \\
x_2 &= H_2(x_1, u_2) \\
x_3 &= H_3(x_1, x_2, u_3) \\
&\ \,\vdots \\
x_n &= H_n(x_1, \ldots, x_{n-1}, u_n)
\end{aligned} $$   (12-1)

for all $u \in I^n$. Let p be the Borel measure corresponding to the
probability distribution F. Let m be the classical Lebesgue measure over the
cube $I^n$.

12.1 THEOREM 


The map K defined by K(f) = foG establishes linear isometric imbed- 
ding of the Lebesgue space L(p, R) of summable functions into the Lebesgue 
space L(m, R) . 

The proof of the theorem is similar to the corresponding proof for one 
variable presented in section 5. 

Corollary. If the right-hand side integral in the following formula
exists, then so does the other and they are equal:

$$ \int_{R^n} f(x)\, dF = \int_{I^n} f(G(u))\, du $$   (12-2)


Definitions of the integral are similar to those in section 5. 
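Formula (12-2) turns an expectation into an ordinary integral over the unit
cube, which can be approximated by any quadrature rule. The sketch below
checks this for the process of Problem 10.2 with the test function
$f(x, y) = e^{-y}$. The test function, the midpoint rule, and the closed-form
value $1/3 + (1 - \log 2)/2$ are choices made for this illustration and do
not appear in the paper; the closed form follows from
$E(e^{-y} \mid x) = \lambda/(\lambda + 1)$ for an exponential lifetime with
rate $\lambda$.

```python
import math


def H1(u1):
    return -2.0 if u1 < 0.5 else 2.0 * u1 - 1.0     # formula (10-18)


def H2(a1, u2):
    rate = 2.0 if a1 <= 0.0 else a1                 # formula (10-19)
    return -math.log(1.0 - u2) / rate


def G(u1, u2):
    """The map x = G(u) of (12-1) for n = 2."""
    x = H1(u1)
    return x, H2(x, u2)


def integrate_over_cube(f, m=200):
    """Midpoint-rule approximation of the integral of f(G(u)) over (0,1)^2."""
    h = 1.0 / m
    total = 0.0
    for i in range(m):
        for k in range(m):
            x, y = G((i + 0.5) * h, (k + 0.5) * h)
            total += f(x, y)
    return total * h * h


approx = integrate_over_cube(lambda x, y: math.exp(-y))
exact = 1.0 / 3.0 + 0.5 * (1.0 - math.log(2.0))
```

The two numbers agree to about three decimal places with the default grid;
a finer grid or a Monte Carlo estimate over the cube can be substituted
without changing G.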
13.0 CONCLUSION


The principal result of this paper is the proof of the existence of a
recursive algorithm by means of which one can simulate on the computer any
finite sequence $x_1, x_2, \ldots, x_n$ of random variables whose joint
distribution F is known. These variables may be dependent and their joint
spectrum may have continuous and discrete components.

This result should be useful in applications requiring the Monte Carlo
method, in particular in problems of quantum chemistry, nuclear and plasma
physics, economics, and stochastic control systems.

A word of caution is appropriate here. Since most computer languages 
use words of a fixed number of bits to represent numbers, the set of num- 
bers available in such languages is finite. Even if one used some set of 
computable real numbers, say all rationals, as one could define, for exam- 
ple, by means of the PL-language of Brainerd-Landweber (ref. 7), the set of 
all numbers available on the computer would be at most countable and thus of 
Lebesgue measure zero. Hence, it is always possible to find a pathological 
example of a distribution F whose computerization H cannot be simulated 
by a computer. However, in most applications the resulting computerization 
H consists of functions that can be well approximated by means of piece- 
wise continuous functions whose computational complexity is not too great. 